Overview

Dataset statistics

Number of variables: 14
Number of observations: 18249
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.9 MiB
Average record size in memory: 112.0 B

Variable types

NUM: 10
CAT: 4

Warnings

Date has a high cardinality: 169 distinct values [High cardinality]
region has a high cardinality: 54 distinct values [High cardinality]
4046 is highly correlated with Total Volume and 3 other fields [High correlation]
Total Volume is highly correlated with 4046 and 3 other fields [High correlation]
4225 is highly correlated with Total Volume and 3 other fields [High correlation]
Total Bags is highly correlated with Total Volume and 4 other fields [High correlation]
Small Bags is highly correlated with Total Volume and 4 other fields [High correlation]
Large Bags is highly correlated with Total Bags and 1 other field [High correlation]
Date is uniformly distributed [Uniform]
region is uniformly distributed [Uniform]
df_index has 432 (2.4%) zeros [Zeros]
4046 has 242 (1.3%) zeros [Zeros]
4770 has 5497 (30.1%) zeros [Zeros]
Large Bags has 2370 (13.0%) zeros [Zeros]
XLarge Bags has 12048 (66.0%) zeros [Zeros]
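
The percentages in the Zeros warnings appear to be the zero count divided by the number of observations (18249). A minimal Python sketch of that arithmetic follows, with counts copied from the warnings; the helper name `zero_share` is illustrative and not part of pandas-profiling.

```python
N_OBSERVATIONS = 18249  # "Number of observations" from the dataset statistics

# Zero counts copied from the Zeros warnings in this report.
zero_counts = {
    "df_index": 432,
    "4046": 242,
    "4770": 5497,
    "Large Bags": 2370,
    "XLarge Bags": 12048,
}


def zero_share(count, total=N_OBSERVATIONS):
    """Return the share of zeros as a percentage, rounded to one decimal."""
    return round(100.0 * count / total, 1)


for name, count in zero_counts.items():
    print(f"{name}: {count} zeros ({zero_share(count)}%)")
```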

Reproduction

Analysis started: 2020-09-14 16:08:34.735821
Analysis finished: 2020-09-14 16:08:59.745843
Duration: 25.01 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

df_index
Real number (ℝ≥0)

ZEROS

Distinct: 53
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 24.2322319
Minimum: 0
Maximum: 52
Zeros: 432
Zeros (%): 2.4%
Memory size: 142.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 10
median: 24
Q3: 38
95-th percentile: 49
Maximum: 52
Range: 52
Interquartile range (IQR): 28

Descriptive statistics

Standard deviation: 15.48104475
Coefficient of variation (CV): 0.6388616953
Kurtosis: -1.254364272
Mean: 24.2322319
Median Absolute Deviation (MAD): 14
Skewness: 0.1083337271
Sum: 442214
Variance: 239.6627467
Monotonicity: Not monotonic
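
Several of these statistics are simple functions of the others. The sketch below, which uses only the values reported for df_index, recomputes the range, IQR, coefficient of variation and variance; the variable names are illustrative.

```python
# Values reported for df_index in this report.
minimum, maximum = 0, 52
q1, q3 = 10, 38
mean = 24.2322319
std = 15.48104475

value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # Interquartile range = Q3 - Q1
cv = std / mean                  # Coefficient of variation = std / mean
variance = std ** 2              # Variance = squared standard deviation

print(value_range, iqr, round(cv, 6), round(variance, 4))
```

The recomputed values can be compared against the Range, IQR, CV and Variance entries listed for df_index.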
Histogram with fixed size bins (bins=50)
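Value | Count | Frequency (%)
7 | 432 | 2.4%
11 | 432 | 2.4%
1 | 432 | 2.4%
2 | 432 | 2.4%
3 | 432 | 2.4%
4 | 432 | 2.4%
5 | 432 | 2.4%
6 | 432 | 2.4%
8 | 432 | 2.4%
9 | 432 | 2.4%
Other values (43) | 13929 | 76.3%

Value | Count | Frequency (%)
0 | 432 | 2.4%
1 | 432 | 2.4%
2 | 432 | 2.4%
3 | 432 | 2.4%
4 | 432 | 2.4%

Value | Count | Frequency (%)
52 | 107 | 0.6%
51 | 322 | 1.8%
50 | 324 | 1.8%
49 | 324 | 1.8%
48 | 324 | 1.8%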

Date
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 169
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 142.6 KiB

Value | Count
2016-02-14 | 108
2017-04-02 | 108
2017-11-12 | 108
2018-01-28 | 108
2017-09-10 | 108
Other values (164) | 17709
Value | Count | Frequency (%)
2016-02-14 | 108 | 0.6%
2017-04-02 | 108 | 0.6%
2017-11-12 | 108 | 0.6%
2018-01-28 | 108 | 0.6%
2017-09-10 | 108 | 0.6%
2015-01-18 | 108 | 0.6%
2015-08-16 | 108 | 0.6%
2016-12-04 | 108 | 0.6%
2015-05-31 | 108 | 0.6%
2016-08-28 | 108 | 0.6%
Other values (159) | 17169 | 94.1%
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

AveragePrice
Real number (ℝ≥0)

Distinct: 259
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.40597841
Minimum: 0.44
Maximum: 3.25
Zeros: 0
Zeros (%): 0.0%
Memory size: 142.6 KiB

Quantile statistics

Minimum: 0.44
5-th percentile: 0.83
Q1: 1.1
median: 1.37
Q3: 1.66
95-th percentile: 2.11
Maximum: 3.25
Range: 2.81
Interquartile range (IQR): 0.56

Descriptive statistics

Standard deviation: 0.4026765555
Coefficient of variation (CV): 0.2864030861
Kurtosis: 0.3251958507
Mean: 1.40597841
Median Absolute Deviation (MAD): 0.28
Skewness: 0.5803027379
Sum: 25657.7
Variance: 0.1621484083
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1.15 | 202 | 1.1%
1.18 | 199 | 1.1%
1.08 | 194 | 1.1%
1.26 | 193 | 1.1%
1.13 | 192 | 1.1%
0.98 | 189 | 1.0%
1.19 | 188 | 1.0%
1.36 | 187 | 1.0%
1.59 | 186 | 1.0%
0.99 | 185 | 1.0%
Other values (249) | 16334 | 89.5%

Value | Count | Frequency (%)
0.44 | 1 | < 0.1%
0.46 | 1 | < 0.1%
0.48 | 1 | < 0.1%
0.49 | 2 | < 0.1%
0.51 | 5 | < 0.1%

Value | Count | Frequency (%)
3.25 | 1 | < 0.1%
3.17 | 1 | < 0.1%
3.12 | 1 | < 0.1%
3.05 | 1 | < 0.1%
3.04 | 1 | < 0.1%

Total Volume
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 18237
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 850644.013
Minimum: 84.56
Maximum: 62505646.52
Zeros: 0
Zeros (%): 0.0%
Memory size: 142.6 KiB

Quantile statistics

Minimum: 84.56
5-th percentile: 2371.862
Q1: 10838.58
median: 107376.76
Q3: 432962.29
95-th percentile: 3716315.41
Maximum: 62505646.52
Range: 62505561.96
Interquartile range (IQR): 422123.71

Descriptive statistics

Standard deviation: 3453545.355
Coefficient of variation (CV): 4.059918488
Kurtosis: 92.10445778
Mean: 850644.013
Median Absolute Deviation (MAD): 102962.47
Skewness: 9.007687479
Sum: 1.552340259e+10
Variance: 1.192697552e+13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
3713.49 | 2 | < 0.1%
3529.44 | 2 | < 0.1%
2038.99 | 2 | < 0.1%
569349.05 | 2 | < 0.1%
4103.97 | 2 | < 0.1%
9465.99 | 2 | < 0.1%
46602.16 | 2 | < 0.1%
2858.31 | 2 | < 0.1%
7223.46 | 2 | < 0.1%
19634.24 | 2 | < 0.1%
Other values (18227) | 18229 | 99.9%

Value | Count | Frequency (%)
84.56 | 1 | < 0.1%
379.82 | 1 | < 0.1%
385.55 | 1 | < 0.1%
419.98 | 1 | < 0.1%
472.82 | 1 | < 0.1%

Value | Count | Frequency (%)
62505646.52 | 1 | < 0.1%
61034457.1 | 1 | < 0.1%
52288697.89 | 1 | < 0.1%
47293921.6 | 1 | < 0.1%
46324529.7 | 1 | < 0.1%

4046
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 17702
Distinct (%): 97.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 293008.4245
Minimum: 0
Maximum: 22743616.17
Zeros: 242
Zeros (%): 1.3%
Memory size: 142.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 19.6
Q1: 854.07
median: 8645.3
Q3: 111020.2
95-th percentile: 1263359.678
Maximum: 22743616.17
Range: 22743616.17
Interquartile range (IQR): 110166.13

Descriptive statistics

Standard deviation: 1264989.082
Coefficient of variation (CV): 4.317244747
Kurtosis: 86.80911256
Mean: 293008.4245
Median Absolute Deviation (MAD): 8616.69
Skewness: 8.648219757
Sum: 5347110739
Variance: 1.600197377e+12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 242 | 1.3%
3 | 10 | 0.1%
1.24 | 8 | < 0.1%
1 | 8 | < 0.1%
4 | 8 | < 0.1%
1.25 | 7 | < 0.1%
6 | 7 | < 0.1%
1.21 | 6 | < 0.1%
2.54 | 5 | < 0.1%
1.27 | 5 | < 0.1%
Other values (17692) | 17943 | 98.3%

Value | Count | Frequency (%)
0 | 242 | 1.3%
1 | 8 | < 0.1%
1.13 | 1 | < 0.1%
1.19 | 3 | < 0.1%
1.2 | 1 | < 0.1%

Value | Count | Frequency (%)
22743616.17 | 1 | < 0.1%
21620180.9 | 1 | < 0.1%
18933038.04 | 1 | < 0.1%
17787611.93 | 1 | < 0.1%
17076650.82 | 1 | < 0.1%

4225
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 18103
Distinct (%): 99.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 295154.5684
Minimum: 0
Maximum: 20470572.61
Zeros: 61
Zeros (%): 0.3%
Memory size: 142.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 103.614
Q1: 3008.78
median: 29061.02
Q3: 150206.86
95-th percentile: 1303657.658
Maximum: 20470572.61
Range: 20470572.61
Interquartile range (IQR): 147198.08

Descriptive statistics

Standard deviation: 1204120.401
Coefficient of variation (CV): 4.079626508
Kurtosis: 91.94902197
Mean: 295154.5684
Median Absolute Deviation (MAD): 28521.3
Skewness: 8.942465608
Sum: 5386275718
Variance: 1.44990594e+12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 61 | 0.3%
215.36 | 3 | < 0.1%
177.87 | 3 | < 0.1%
1.3 | 3 | < 0.1%
94.74 | 3 | < 0.1%
1.26 | 3 | < 0.1%
3478.97 | 2 | < 0.1%
61.01 | 2 | < 0.1%
65.22 | 2 | < 0.1%
5.73 | 2 | < 0.1%
Other values (18093) | 18165 | 99.5%

Value | Count | Frequency (%)
0 | 61 | 0.3%
1.26 | 3 | < 0.1%
1.28 | 2 | < 0.1%
1.3 | 3 | < 0.1%
1.31 | 1 | < 0.1%

Value | Count | Frequency (%)
20470572.61 | 1 | < 0.1%
20445501.03 | 1 | < 0.1%
20328161.55 | 1 | < 0.1%
18956479.74 | 1 | < 0.1%
17896391.6 | 1 | < 0.1%

4770
Real number (ℝ≥0)

ZEROS

Distinct: 12071
Distinct (%): 66.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 22839.73599
Minimum: 0
Maximum: 2546439.11
Zeros: 5497
Zeros (%): 30.1%
Memory size: 142.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 184.99
Q3: 6243.42
95-th percentile: 106156.574
Maximum: 2546439.11
Range: 2546439.11
Interquartile range (IQR): 6243.42

Descriptive statistics

Standard deviation: 107464.0684
Coefficient of variation (CV): 4.705136192
Kurtosis: 132.5634409
Mean: 22839.73599
Median Absolute Deviation (MAD): 184.99
Skewness: 10.15939563
Sum: 416802342.1
Variance: 1.1548526e+10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 5497 | 30.1%
2.66 | 7 | < 0.1%
3.32 | 7 | < 0.1%
1.64 | 6 | < 0.1%
10.97 | 6 | < 0.1%
1.6 | 6 | < 0.1%
1.59 | 6 | < 0.1%
2.74 | 5 | < 0.1%
1.65 | 5 | < 0.1%
1.63 | 5 | < 0.1%
Other values (12061) | 12699 | 69.6%

Value | Count | Frequency (%)
0 | 5497 | 30.1%
0.83 | 1 | < 0.1%
1 | 3 | < 0.1%
1.01 | 1 | < 0.1%
1.09 | 1 | < 0.1%

Value | Count | Frequency (%)
2546439.11 | 1 | < 0.1%
1993645.36 | 1 | < 0.1%
1896149.5 | 1 | < 0.1%
1880231.38 | 1 | < 0.1%
1811090.71 | 1 | < 0.1%

Total Bags
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 18097
Distinct (%): 99.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 239639.2021
Minimum: 0
Maximum: 19373134.37
Zeros: 15
Zeros (%): 0.1%
Memory size: 142.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 628.89
Q1: 5088.64
median: 39743.83
Q3: 110783.37
95-th percentile: 1005478.892
Maximum: 19373134.37
Range: 19373134.37
Interquartile range (IQR): 105694.73

Descriptive statistics

Standard deviation: 986242.3992
Coefficient of variation (CV): 4.115530309
Kurtosis: 112.2721565
Mean: 239639.2021
Median Absolute Deviation (MAD): 37299.96
Skewness: 9.75607167
Sum: 4373175798
Variance: 9.7267407e+11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 15 | 0.1%
300 | 5 | < 0.1%
990 | 5 | < 0.1%
916.67 | 4 | < 0.1%
266.67 | 4 | < 0.1%
550 | 4 | < 0.1%
856.67 | 3 | < 0.1%
153.33 | 3 | < 0.1%
196.67 | 3 | < 0.1%
803.33 | 3 | < 0.1%
Other values (18087) | 18200 | 99.7%

Value | Count | Frequency (%)
0 | 15 | 0.1%
3.09 | 1 | < 0.1%
3.11 | 1 | < 0.1%
3.19 | 1 | < 0.1%
3.33 | 1 | < 0.1%

Value | Count | Frequency (%)
19373134.37 | 1 | < 0.1%
16394524.11 | 1 | < 0.1%
16298296.29 | 1 | < 0.1%
15972492.07 | 1 | < 0.1%
15804696.31 | 1 | < 0.1%

Small Bags
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 17321
Distinct (%): 94.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 182194.6867
Minimum: 0
Maximum: 13384586.8
Zeros: 159
Zeros (%): 0.9%
Memory size: 142.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 256.67
Q1: 2849.42
median: 26362.82
Q3: 83337.67
95-th percentile: 768147.228
Maximum: 13384586.8
Range: 13384586.8
Interquartile range (IQR): 80488.25

Descriptive statistics

Standard deviation: 746178.515
Coefficient of variation (CV): 4.095500964
Kurtosis: 107.0128851
Mean: 182194.6867
Median Absolute Deviation (MAD): 25599.49
Skewness: 9.540659982
Sum: 3324870838
Variance: 5.567823762e+11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 159 | 0.9%
203.33 | 11 | 0.1%
533.33 | 10 | 0.1%
223.33 | 10 | 0.1%
103.33 | 8 | < 0.1%
326.67 | 8 | < 0.1%
300 | 8 | < 0.1%
196.67 | 8 | < 0.1%
263.33 | 8 | < 0.1%
123.33 | 8 | < 0.1%
Other values (17311) | 18011 | 98.7%

Value | Count | Frequency (%)
0 | 159 | 0.9%
2.52 | 1 | < 0.1%
2.57 | 1 | < 0.1%
2.73 | 1 | < 0.1%
2.79 | 1 | < 0.1%

Value | Count | Frequency (%)
13384586.8 | 1 | < 0.1%
12567155.58 | 1 | < 0.1%
12540327.19 | 1 | < 0.1%
11712807.19 | 1 | < 0.1%
11392828.89 | 1 | < 0.1%

Large Bags
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 15082
Distinct (%): 82.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 54338.08814
Minimum: 0
Maximum: 5719096.61
Zeros: 2370
Zeros (%): 13.0%
Memory size: 142.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 127.47
median: 2647.71
Q3: 22029.25
95-th percentile: 195699.768
Maximum: 5719096.61
Range: 5719096.61
Interquartile range (IQR): 21901.78

Descriptive statistics

Standard deviation: 243965.9645
Coefficient of variation (CV): 4.489778218
Kurtosis: 117.999481
Mean: 54338.08814
Median Absolute Deviation (MAD): 2647.71
Skewness: 9.796454599
Sum: 991615770.6
Variance: 5.951939186e+10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 2370 | 13.0%
3.33 | 187 | 1.0%
6.67 | 78 | 0.4%
10 | 47 | 0.3%
4.44 | 38 | 0.2%
13.33 | 28 | 0.2%
16.67 | 18 | 0.1%
6.66 | 18 | 0.1%
26.67 | 18 | 0.1%
20 | 14 | 0.1%
Other values (15072) | 15433 | 84.6%

Value | Count | Frequency (%)
0 | 2370 | 13.0%
0.97 | 1 | < 0.1%
1.3 | 1 | < 0.1%
1.33 | 1 | < 0.1%
1.38 | 2 | < 0.1%

Value | Count | Frequency (%)
5719096.61 | 1 | < 0.1%
4324231.19 | 1 | < 0.1%
4081397.72 | 1 | < 0.1%
4023485.04 | 1 | < 0.1%
3988101.74 | 1 | < 0.1%

XLarge Bags
Real number (ℝ≥0)

ZEROS

Distinct: 5588
Distinct (%): 30.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3106.426507
Minimum: 0
Maximum: 551693.65
Zeros: 12048
Zeros (%): 66.0%
Memory size: 142.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 132.5
95-th percentile: 12058.452
Maximum: 551693.65
Range: 551693.65
Interquartile range (IQR): 132.5

Descriptive statistics

Standard deviation: 17692.89465
Coefficient of variation (CV): 5.695578058
Kurtosis: 233.6026119
Mean: 3106.426507
Median Absolute Deviation (MAD): 0
Skewness: 13.13975069
Sum: 56689177.33
Variance: 313038521.2
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 12048 | 66.0%
3.33 | 29 | 0.2%
6.67 | 16 | 0.1%
1.11 | 15 | 0.1%
5 | 12 | 0.1%
10 | 9 | < 0.1%
16.67 | 8 | < 0.1%
2.22 | 7 | < 0.1%
150 | 6 | < 0.1%
80 | 6 | < 0.1%
Other values (5578) | 6093 | 33.4%

Value | Count | Frequency (%)
0 | 12048 | 66.0%
1 | 1 | < 0.1%
1.11 | 15 | 0.1%
1.26 | 1 | < 0.1%
1.3 | 1 | < 0.1%

Value | Count | Frequency (%)
551693.65 | 1 | < 0.1%
454343.65 | 1 | < 0.1%
390478.73 | 1 | < 0.1%
387400.22 | 1 | < 0.1%
377661.06 | 1 | < 0.1%

type
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 142.6 KiB

Value | Count
conventional | 9126
organic | 9123
Value | Count | Frequency (%)
conventional | 9126 | 50.0%
organic | 9123 | 50.0%
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 12
Median length: 12
Mean length: 9.500410981
Min length: 7

year
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 142.6 KiB

Value | Count
2017 | 5722
2016 | 5616
2015 | 5615
2018 | 1296
Value | Count | Frequency (%)
2017 | 5722 | 31.4%
2016 | 5616 | 30.8%
2015 | 5615 | 30.8%
2018 | 1296 | 7.1%
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

region
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 54
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 142.6 KiB

Value | Count
Albany | 338
Philadelphia | 338
Pittsburgh | 338
Northeast | 338
Indianapolis | 338
Other values (49) | 16559
Value | Count | Frequency (%)
Albany | 338 | 1.9%
Philadelphia | 338 | 1.9%
Pittsburgh | 338 | 1.9%
Northeast | 338 | 1.9%
Indianapolis | 338 | 1.9%
NewOrleansMobile | 338 | 1.9%
Sacramento | 338 | 1.9%
Orlando | 338 | 1.9%
Atlanta | 338 | 1.9%
Tampa | 338 | 1.9%
Other values (44) | 14869 | 81.5%
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 19
Median length: 9
Mean length: 10.29535865
Min length: 4

Interactions

[Interaction plots omitted: 100 images in the original report]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear relationship the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
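
The calculation described above can be sketched in plain Python. This is an illustrative implementation, not the code pandas-profiling runs; the function name `pearson_r` and the toy data are assumptions.

```python
import math


def pearson_r(xs, ys):
    """Covariance of X and Y divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors of the covariance and of the standard deviations cancel,
    # so plain sums of (cross-)deviations are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Toy data (not taken from the dataset): y is an exact linear function of x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 * x + 1.0 for x in xs]
print(pearson_r(xs, ys))
# Shifting and rescaling one variable leaves r unchanged.
print(pearson_r(xs, [10.0 * y - 3.0 for y in ys]))
```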

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
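
A minimal sketch of this rank-based calculation in plain Python follows. It assumes tied values receive the average of their ranks, a common convention that the text does not specify; the helper names and toy data are illustrative.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of equal values that starts at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Pearson's r applied to the rank variables of X and Y."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Toy data (not from the dataset): monotonic but clearly nonlinear.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
print(spearman_rho(xs, ys), pearson_r(xs, ys))
```

On this toy input the rank correlation is perfect while Pearson's r falls below 1, which is the behaviour the paragraph above describes.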

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
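
The pair-counting definition above can be sketched directly in plain Python. This follows the formula as stated in the text, with no correction for ties; library implementations may use a tie-corrected variant, so results can differ on data with many ties. The function name and toy data are illustrative.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1   # both variables move in the same direction
        elif product < 0:
            discordant += 1   # the variables move in opposite directions
        # product == 0 is a tie and counts as neither
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


# Toy data (not from the dataset): one swapped pair out of three.
print(kendall_tau([1, 2, 3], [1, 3, 2]))
```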

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
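
A minimal plain-Python sketch of a bias-corrected Cramér's V follows. It assumes the correction takes the commonly cited form attributed to Bergsma (2013), in which both the chi-squared-based φ² and the table dimensions are adjusted by terms in 1/(n - 1); whether pandas-profiling computes it exactly this way is not confirmed by this report. The function name and toy tables are illustrative.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table (list of rows of counts)."""
    n_rows = len(table)
    n_cols = len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(n_rows)) for j in range(n_cols)]
    n = sum(row_totals)

    # Pearson's chi-squared statistic against the independence model.
    chi2 = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Assumed correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (n_cols - 1) * (n_rows - 1) / (n - 1))
    rows_corr = n_rows - (n_rows - 1) ** 2 / (n - 1)
    cols_corr = n_cols - (n_cols - 1) ** 2 / (n - 1)
    denom = min(rows_corr - 1, cols_corr - 1)
    if denom <= 0:
        return 0.0
    return math.sqrt(phi2_corr / denom)


# Toy tables (not from the dataset).
print(cramers_v_corrected([[10, 10], [10, 10]]))  # independent categories
print(cramers_v_corrected([[20, 0], [0, 20]]))    # perfectly associated categories
```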

Missing values

[Missing-value plots omitted: 2 images in the original report]

Sample

First rows

index | df_index | Date | AveragePrice | Total Volume | 4046 | 4225 | 4770 | Total Bags | Small Bags | Large Bags | XLarge Bags | type | year | region
0 | 0 | 2015-12-27 | 1.33 | 64236.62 | 1036.74 | 54454.85 | 48.16 | 8696.87 | 8603.62 | 93.25 | 0.0 | conventional | 2015 | Albany
1 | 1 | 2015-12-20 | 1.35 | 54876.98 | 674.28 | 44638.81 | 58.33 | 9505.56 | 9408.07 | 97.49 | 0.0 | conventional | 2015 | Albany
2 | 2 | 2015-12-13 | 0.93 | 118220.22 | 794.70 | 109149.67 | 130.50 | 8145.35 | 8042.21 | 103.14 | 0.0 | conventional | 2015 | Albany
3 | 3 | 2015-12-06 | 1.08 | 78992.15 | 1132.00 | 71976.41 | 72.58 | 5811.16 | 5677.40 | 133.76 | 0.0 | conventional | 2015 | Albany
4 | 4 | 2015-11-29 | 1.28 | 51039.60 | 941.48 | 43838.39 | 75.78 | 6183.95 | 5986.26 | 197.69 | 0.0 | conventional | 2015 | Albany
5 | 5 | 2015-11-22 | 1.26 | 55979.78 | 1184.27 | 48067.99 | 43.61 | 6683.91 | 6556.47 | 127.44 | 0.0 | conventional | 2015 | Albany
6 | 6 | 2015-11-15 | 0.99 | 83453.76 | 1368.92 | 73672.72 | 93.26 | 8318.86 | 8196.81 | 122.05 | 0.0 | conventional | 2015 | Albany
7 | 7 | 2015-11-08 | 0.98 | 109428.33 | 703.75 | 101815.36 | 80.00 | 6829.22 | 6266.85 | 562.37 | 0.0 | conventional | 2015 | Albany
8 | 8 | 2015-11-01 | 1.02 | 99811.42 | 1022.15 | 87315.57 | 85.34 | 11388.36 | 11104.53 | 283.83 | 0.0 | conventional | 2015 | Albany
9 | 9 | 2015-10-25 | 1.07 | 74338.76 | 842.40 | 64757.44 | 113.00 | 8625.92 | 8061.47 | 564.45 | 0.0 | conventional | 2015 | Albany

Last rows

index | df_index | Date | AveragePrice | Total Volume | 4046 | 4225 | 4770 | Total Bags | Small Bags | Large Bags | XLarge Bags | type | year | region
18239 | 2 | 2018-03-11 | 1.56 | 22128.42 | 2162.67 | 3194.25 | 8.93 | 16762.57 | 16510.32 | 252.25 | 0.0 | organic | 2018 | WestTexNewMexico
18240 | 3 | 2018-03-04 | 1.54 | 17393.30 | 1832.24 | 1905.57 | 0.00 | 13655.49 | 13401.93 | 253.56 | 0.0 | organic | 2018 | WestTexNewMexico
18241 | 4 | 2018-02-25 | 1.57 | 18421.24 | 1974.26 | 2482.65 | 0.00 | 13964.33 | 13698.27 | 266.06 | 0.0 | organic | 2018 | WestTexNewMexico
18242 | 5 | 2018-02-18 | 1.56 | 17597.12 | 1892.05 | 1928.36 | 0.00 | 13776.71 | 13553.53 | 223.18 | 0.0 | organic | 2018 | WestTexNewMexico
18243 | 6 | 2018-02-11 | 1.57 | 15986.17 | 1924.28 | 1368.32 | 0.00 | 12693.57 | 12437.35 | 256.22 | 0.0 | organic | 2018 | WestTexNewMexico
18244 | 7 | 2018-02-04 | 1.63 | 17074.83 | 2046.96 | 1529.20 | 0.00 | 13498.67 | 13066.82 | 431.85 | 0.0 | organic | 2018 | WestTexNewMexico
18245 | 8 | 2018-01-28 | 1.71 | 13888.04 | 1191.70 | 3431.50 | 0.00 | 9264.84 | 8940.04 | 324.80 | 0.0 | organic | 2018 | WestTexNewMexico
18246 | 9 | 2018-01-21 | 1.87 | 13766.76 | 1191.92 | 2452.79 | 727.94 | 9394.11 | 9351.80 | 42.31 | 0.0 | organic | 2018 | WestTexNewMexico
18247 | 10 | 2018-01-14 | 1.93 | 16205.22 | 1527.63 | 2981.04 | 727.01 | 10969.54 | 10919.54 | 50.00 | 0.0 | organic | 2018 | WestTexNewMexico
18248 | 11 | 2018-01-07 | 1.62 | 17489.58 | 2894.77 | 2356.13 | 224.53 | 12014.15 | 11988.14 | 26.01 | 0.0 | organic | 2018 | WestTexNewMexico